Multi-Modal Dataset Acquisition for Photometrically Challenging Objects
This paper addresses the limitations of current datasets for 3D vision tasks
in terms of accuracy, size, realism, and suitable imaging modalities for
photometrically challenging objects. We propose a novel annotation and
acquisition pipeline that enhances existing 3D perception and 6D object pose
datasets. Our approach integrates robotic forward-kinematics, external infrared
trackers, and improved calibration and annotation procedures. We present a
multi-modal sensor rig, mounted on a robotic end-effector, and demonstrate how
it is integrated into the creation of highly accurate datasets. Additionally,
we introduce a freehand procedure for wider viewpoint coverage. Both approaches
yield high-quality 3D data with accurate object and camera pose annotations.
Our methods overcome the limitations of existing datasets and provide valuable
resources for 3D vision research.
Comment: Accepted at ICCV 2023 TRICKY Workshop
Polarimetric Information for Multi-Modal 6D Pose Estimation of Photometrically Challenging Objects with Limited Data
6D pose estimation pipelines that rely on RGB-only or RGB-D data show
limitations for photometrically challenging objects with, e.g., textureless
surfaces, reflections, or transparency. A supervised learning-based method
utilising complementary polarisation information as input modality is proposed
to overcome such limitations. This supervised approach is then extended to a
self-supervised paradigm by leveraging physical characteristics of polarised
light, thus eliminating the need for annotated real data. The methods achieve
significant advancements in pose estimation by leveraging geometric information
from polarised light and incorporating shape priors and invertible physical
constraints.
Comment: Accepted at ICCV 2023 TRICKY Workshop
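The pose-estimation advancements described above are conventionally quantified with the average distance (ADD) metric, which compares a model's points under the predicted and ground-truth rigid transforms. The abstract does not state the exact evaluation protocol, so the following is a minimal, commonly used sketch rather than the paper's own code; the function name and argument layout are illustrative.

```python
import numpy as np

def add_metric(R_pred, t_pred, R_gt, t_gt, points):
    """Average distance (ADD) between a model's 3D points transformed by
    the predicted pose and by the ground-truth pose.

    R_*: (3, 3) rotation matrices; t_*: (3,) translations;
    points: (N, 3) object model points in the object frame.
    """
    p_pred = points @ R_pred.T + t_pred  # points under predicted pose
    p_gt = points @ R_gt.T + t_gt        # points under ground-truth pose
    # mean Euclidean distance between corresponding points
    return float(np.linalg.norm(p_pred - p_gt, axis=1).mean())
```

A prediction is then typically counted as correct when its ADD falls below a fraction (often 10%) of the object's diameter.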
Polarimetric Pose Prediction
Light has many properties that vision sensors can passively measure.
Colour-band separated wavelength and intensity are arguably the most commonly
used for monocular 6D object pose estimation. This paper explores how
complementary polarisation information, i.e. the orientation of light wave
oscillations, influences the accuracy of pose predictions. A hybrid model that
leverages physical priors jointly with a data-driven learning strategy is
designed and carefully tested on objects with different levels of photometric
complexity. Our design significantly improves the pose accuracy compared to
state-of-the-art photometric approaches and enables object pose estimation for
highly reflective and transparent objects. A new multi-modal instance-level 6D
object pose dataset with highly accurate pose annotations for multiple objects
with varying photometric complexity is introduced as a benchmark.
Comment: Accepted at ECCV 2022; 25 pages (14 main paper + References + 7
Appendix)
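The "orientation of light wave oscillations" mentioned above is typically recovered from four intensity images captured behind linear polarisers at 0°, 45°, 90°, and 135°, via the linear Stokes parameters. The sketch below shows this standard physics, not the paper's specific pipeline; the function name is illustrative.

```python
import numpy as np

def polarisation_cues(i0, i45, i90, i135):
    """Degree (DoLP) and angle (AoLP) of linear polarisation from four
    intensity images taken behind polarisers at 0/45/90/135 degrees."""
    # Linear Stokes parameters
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                        # horizontal vs vertical
    s2 = i45 - i135                      # diagonal components
    # DoLP in [0, 1]; guard against division by zero in dark pixels
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.clip(s0, 1e-8, None)
    # AoLP in [-pi/2, pi/2]
    aolp = 0.5 * np.arctan2(s2, s1)
    return dolp, aolp
```

The AoLP map encodes surface-orientation cues, which is what makes polarisation a useful geometric complement to RGB for textureless or reflective objects.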
HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Pose Dataset with Household Objects in Realistic Scenarios
Estimating the 6D pose of objects is a major 3D computer vision problem.
Given the promising outcomes of instance-level approaches, research is also
moving towards category-level pose estimation for more practical application
scenarios. However, unlike well-established instance-level pose datasets,
available category-level datasets fall short in annotation quality and in the
quantity of provided poses. We propose the new category-level 6D pose dataset HouseCat6D
featuring 1) Multi-modality of Polarimetric RGB and Depth (RGBD+P), 2) 194
highly diverse objects across 10 household object categories, including 2
photometrically challenging categories, 3) High-quality pose annotation with an
error range of only 1.35 mm to 1.74 mm, 4) 41 large-scale scenes with extensive
viewpoint coverage and occlusions, 5) Checkerboard-free environment throughout
the entire scene, and 6) Additionally annotated dense 6D parallel-jaw grasps.
Furthermore, we provide benchmark results of state-of-the-art
category-level pose estimation networks.
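Category-level benchmarks like the one above commonly report success under joint rotation/translation thresholds (e.g. 5° and 5 cm). The abstract does not specify the exact metric, so the following is a hedged sketch of that common protocol; the function names and default thresholds are illustrative.

```python
import numpy as np

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    cos = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def pose_success(R_pred, t_pred, R_gt, t_gt,
                 rot_thresh_deg=5.0, trans_thresh=0.05):
    """True if the pose is within both the rotation threshold (degrees)
    and the translation threshold (same unit as t, e.g. metres)."""
    rot_ok = rotation_error_deg(R_pred, R_gt) <= rot_thresh_deg
    trans_ok = np.linalg.norm(t_pred - t_gt) <= trans_thresh
    return bool(rot_ok and trans_ok)
```

Reporting the fraction of test poses passing this check yields the familiar "5°5cm" accuracy number.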
On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks
Learning-based methods to solve dense 3D vision problems typically train on
3D sensor data. Each underlying distance-measurement principle has its own
advantages and drawbacks, yet these are typically neither compared nor
discussed in the literature due to a lack of multi-modal datasets.
Texture-less regions are problematic for structure from motion and stereo,
reflective materials pose issues for active sensing, and distances to
translucent objects are difficult to measure with existing hardware. Training
on inaccurate or corrupt data
induces model bias and hampers generalisation capabilities. These effects
remain unnoticed if the sensor measurement is treated as ground truth during
evaluation. This paper investigates the effect of sensor errors on the
dense 3D vision tasks of depth estimation and reconstruction. We rigorously
show the significant impact of sensor characteristics on the learned
predictions and notice generalisation issues arising from various technologies
in everyday household environments. For evaluation, we introduce a carefully
designed dataset\footnote{dataset available at
https://github.com/Junggy/HAMMER-dataset} comprising measurements from
commodity sensors, namely D-ToF, I-ToF, passive/active stereo, and monocular
RGB+P. Our study quantifies the considerable sensor noise impact and paves the
way to improved dense vision estimates and targeted data fusion.Comment: Accepted at CVPR 2023, Main Paper + Supp. Mat. arXiv admin note:
substantial text overlap with arXiv:2205.0456
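Quantifying "the considerable sensor noise impact" as described above requires comparing each sensor's depth map against a more accurate reference over the pixels where both are valid. The snippet below is a minimal sketch of two standard depth-error metrics (RMSE and absolute relative error); it is illustrative and not taken from the paper's evaluation code.

```python
import numpy as np

def depth_errors(pred, gt, valid=None):
    """RMSE and absolute relative error between predicted and reference
    depth maps, restricted to valid reference pixels.

    pred, gt: same-shape float arrays of depth values.
    valid: optional boolean mask; defaults to gt > 0 (zero = no reading).
    """
    if valid is None:
        valid = gt > 0
    p, g = pred[valid], gt[valid]
    rmse = np.sqrt(np.mean((p - g) ** 2))      # penalises large outliers
    abs_rel = np.mean(np.abs(p - g) / g)       # scale-aware relative error
    return float(rmse), float(abs_rel)
```

Masking invalid pixels matters precisely for the failure modes the paper highlights: reflective and translucent surfaces often yield zero or corrupt readings, and including them would silently distort the scores.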